Prompt tuning is an emerging approach for adapting pre-trained language models to downstream tasks. However, existing studies mainly add prompts to the input sequence. This does not work as expected, because the intermediate multi-head self-attention and feed-forward network computations make model optimization suboptimal. We therefore propose a novel tuning approach called layer tuning, which adds learnable parameters inside Transformer layers. Specifically, we focus on layer tuning of the feed-forward network in the Transformer, namely FL-tuning. It introduces additional units into the hidden layer of each feed-forward network. We conduct extensive experiments on the public CLUE benchmark. The results show that: 1) our FL-tuning outperforms prompt tuning methods under both full-data and few-shot settings in almost all cases; in particular, it improves accuracy on WSC 1.0 by 17.93% (full-data setting) and improves F1 on CLUENER over P-tuning v2 (few-shot setting); 2) our FL-tuning is more stable and converges about 1.17 times faster than P-tuning v2; 3) with only about 3% of the Transformer's parameters to train, it is comparable with fine-tuning on most datasets and significantly outperforms fine-tuning on several of them (e.g., accuracy improved by 12.9% on WSC 1.1). The source code is available at https://github.com/genggui001/fl-tuning.
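To make the idea of layer tuning concrete, the following is a minimal PyTorch sketch of adding extra trainable hidden units alongside a frozen feed-forward network; the class name, dimensions, and the additive way the extra units are combined are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: extra learnable hidden units appended to a frozen Transformer FFN.
# Names and sizes are assumptions for illustration only.
import torch
import torch.nn as nn

class FLTunedFFN(nn.Module):
    def __init__(self, d_model=768, d_ff=3072, d_extra=64):
        super().__init__()
        # Original (pre-trained) FFN weights, kept frozen.
        self.fc1 = nn.Linear(d_model, d_ff)
        self.fc2 = nn.Linear(d_ff, d_model)
        for p in (*self.fc1.parameters(), *self.fc2.parameters()):
            p.requires_grad = False
        # Additional trainable hidden units: the only new parameters.
        self.extra_in = nn.Linear(d_model, d_extra)
        self.extra_out = nn.Linear(d_extra, d_model)
        self.act = nn.GELU()

    def forward(self, x):
        # Frozen path plus the contribution of the extra hidden units.
        return self.fc2(self.act(self.fc1(x))) + self.extra_out(self.act(self.extra_in(x)))

x = torch.randn(2, 10, 768)      # (batch, seq_len, d_model)
print(FLTunedFFN()(x).shape)     # torch.Size([2, 10, 768])
```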
This paper aims to improve the performance of text-to-SQL parsing by exploring the intrinsic uncertainties in a neural network based approach, called SUN. From the data uncertainty perspective, a single SQL query can be learned from multiple semantically equivalent questions. Different from previous methods that are limited to one-to-one mappings, we propose a data uncertainty constraint to explore the potential complementary semantic information among multiple semantically equivalent questions (many-to-one) and learn robust feature representations with reduced spurious associations. In this way, we can reduce the sensitivity of the learned representations and improve the robustness of the parser. From the model uncertainty perspective, there is often structural information (dependencies) among the weights of a neural network. To improve the generalizability and stability of neural text-to-SQL parsers, we propose a model uncertainty constraint that refines the query representations by enforcing the output representations of differently perturbed encoding networks to be consistent with each other. Extensive experiments on five benchmark datasets show that our method significantly outperforms strong competitors and achieves new state-of-the-art results. For reproducibility, we release the code and data at https://github.com/AlibabaResearch/DAMO-ConvAI/tree/main/sunsql.
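As a rough illustration of the model uncertainty constraint, the sketch below encodes the same question twice under different dropout perturbations and penalizes disagreement between the two representations; the toy encoder and the cosine-based consistency loss are assumptions and may differ from the loss actually used in SUN.

```python
# Toy consistency loss between two dropout-perturbed encodings of a question.
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Dropout(p=0.1), nn.Linear(256, 256))

def consistency_loss(question_emb):
    encoder.train()                  # keep dropout active for both passes
    z1 = encoder(question_emb)       # first perturbed encoding
    z2 = encoder(question_emb)       # second perturbed encoding
    return 1.0 - F.cosine_similarity(z1, z2, dim=-1).mean()

q = torch.randn(4, 128)              # a batch of 4 toy question embeddings
print(consistency_loss(q))
```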
Text-to-SQL parsing is an essential and challenging task. Its goal is to convert a natural language (NL) question into its corresponding structured query language (SQL) based on the evidence provided by a relational database. Early text-to-SQL parsing systems from the database community achieved notable progress at the cost of heavy human engineering and user interaction with the system. In recent years, deep neural networks have significantly advanced this task with neural generation models, which automatically learn a mapping function from the input NL question to the output SQL query. Subsequently, large pre-trained language models have taken the state of the art of text-to-SQL parsing to a new level. In this survey, we present a comprehensive review of deep learning approaches for text-to-SQL parsing. First, we introduce the text-to-SQL parsing corpora, which can be categorized as single-turn and multi-turn. Second, we provide a systematic overview of pre-trained language models and existing methods for text-to-SQL parsing. Third, we present the challenges faced by text-to-SQL parsing and explore some potential future directions in this field.
The importance of building text-to-SQL parsers that can be applied to new databases has long been acknowledged, and a critical step toward this goal is schema linking, i.e., properly recognizing mentions of unseen columns or tables when generating SQL. In this work, we propose a novel framework that elicits relational structures from large-scale pre-trained language models (PLMs) via a probing procedure based on the Poincaré distance metric, and uses the induced relations to augment graph-based parsers for better schema linking. Compared with commonly used rule-based schema linking methods, we find that the probed relations can robustly capture semantic correspondences even when the surface forms of mentions and entities differ. Moreover, our probing procedure is entirely unsupervised and requires no additional parameters. Extensive experiments show that our framework sets new state-of-the-art performance on three benchmarks. We empirically verify through qualitative analysis that our probing procedure can indeed find the desired relational structures.
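The Poincaré distance underlying the probing procedure has a standard closed form, shown in the sketch below; how the paper maps PLM embeddings into the Poincaré ball is not specified here, so the simple norm-based projection is an assumption.

```python
# Standard Poincaré-ball distance between two embeddings.
# The projection into the unit ball is an assumed, simplified choice.
import torch

def poincare_distance(u, v, eps=1e-5):
    # Map vectors into the open unit ball (norm strictly below 1).
    u = u / (u.norm(dim=-1, keepdim=True) + 1.0 + eps)
    v = v / (v.norm(dim=-1, keepdim=True) + 1.0 + eps)
    sq = (u - v).pow(2).sum(-1)
    denom = (1 - u.pow(2).sum(-1)) * (1 - v.pow(2).sum(-1))
    return torch.acosh(1 + 2 * sq / (denom + eps))

col = torch.randn(768)   # e.g. a column-name embedding from a PLM
tok = torch.randn(768)   # e.g. a question-token embedding
print(poincare_distance(col, tok))
```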
Recently, pre-trained models have significantly improved the performance of various NLP tasks by leveraging large-scale text corpora to strengthen the contextual representation ability of neural networks. Large pre-trained language models have also been applied to the area of table semantic parsing. However, existing pre-training approaches have not carefully explored the explicit interaction relationships between a question and the corresponding database schema, which is a key ingredient for uncovering their semantic and structural correspondence. Furthermore, question-aware representation learning in the schema grounding context has received less attention in pre-training objectives. To alleviate these issues, this paper designs two novel pre-training objectives that impose the desired inductive bias into the learned representations for table pre-training. We further propose a schema-aware curriculum learning approach to mitigate the impact of noise and learn from the pre-training data in an easy-to-hard manner. We evaluate our pre-trained framework by fine-tuning it on two benchmarks, Spider and SQUALL. The results demonstrate the effectiveness of our pre-training objectives and curriculum compared with a variety of baselines.
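A toy sketch of the easy-to-hard curriculum idea follows; the difficulty heuristic (counting schema items mentioned in the question) is purely illustrative and much simpler than the schema-aware scoring the paper describes.

```python
# Toy curriculum ordering: fewer schema mentions is treated as "easier".
def curriculum_order(examples, schema_items):
    def difficulty(example):
        question = example["question"].lower()
        return sum(item.lower() in question for item in schema_items)
    return sorted(examples, key=difficulty)

examples = [
    {"question": "How many singers do we have?"},
    {"question": "List the name and country of every singer ordered by age."},
]
schema_items = ["singer", "name", "country", "age"]
for ex in curriculum_order(examples, schema_items):
    print(ex["question"])
```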
Deep learning models can achieve high accuracy when trained on large amounts of labeled data. However, real-world scenarios often involve several challenges: Training data may become available in installments, may originate from multiple different domains, and may not contain labels for training. Certain settings, for instance medical applications, often involve further restrictions that prohibit retention of previously seen data due to privacy regulations. In this work, to address such challenges, we study unsupervised segmentation in continual learning scenarios that involve domain shift. To that end, we introduce GarDA (Generative Appearance Replay for continual Domain Adaptation), a generative-replay based approach that can adapt a segmentation model sequentially to new domains with unlabeled data. In contrast to single-step unsupervised domain adaptation (UDA), continual adaptation to a sequence of domains enables leveraging and consolidation of information from multiple domains. Unlike previous approaches in incremental UDA, our method does not require access to previously seen data, making it applicable in many practical scenarios. We evaluate GarDA on two datasets with different organs and modalities, where it substantially outperforms existing techniques.
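The following compressed PyTorch sketch illustrates one training step of generative replay for an unlabeled new domain: replayed images stand in for previously seen domains, and pseudo-labels come from a frozen copy of the earlier model. The toy modules and the pseudo-labeling choice are assumptions, not the GarDA architecture.

```python
import torch
import torch.nn as nn

# Toy 2-class segmenters: the current model being adapted and a frozen copy
# that represents knowledge from previously seen domains.
segmenter = nn.Conv2d(1, 2, kernel_size=3, padding=1)
old_segmenter = nn.Conv2d(1, 2, kernel_size=3, padding=1)
old_segmenter.load_state_dict(segmenter.state_dict())

# Stand-in for the appearance generator that replays earlier domains.
generator = lambda n: torch.rand(n, 1, 32, 32)

optimizer = torch.optim.Adam(segmenter.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

new_domain_batch = torch.rand(4, 1, 32, 32)   # unlabeled images from the new domain
replay_batch = generator(4)                    # replayed images for old domains
images = torch.cat([new_domain_batch, replay_batch])

# The new domain is unlabeled, so supervision comes from pseudo-labels
# produced by the frozen previous model (an assumed, common replay choice).
with torch.no_grad():
    pseudo_labels = old_segmenter(images).argmax(dim=1)

loss = criterion(segmenter(images), pseudo_labels)
optimizer.zero_grad(); loss.backward(); optimizer.step()
print(float(loss))
```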
The development of social media user stance detection and bot detection methods relies heavily on large-scale, high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, which suppresses graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB was built from the largest original data in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. For user features, we extracted the 20 user property features with the greatest information gain together with user tweet features. In addition, we performed a thorough evaluation of MGTAB and other public datasets. Our experiments found that graph-based approaches are generally more effective than feature-based approaches and perform better when multiple relations are introduced. By analyzing the experimental results, we identify effective approaches for account detection and provide potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB.
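As an illustration of the information-gain feature ranking mentioned above, the sketch below uses scikit-learn's mutual information estimator on synthetic data; it is a stand-in, not the benchmark's actual feature-selection procedure.

```python
# Rank toy user-property features by estimated information gain and keep the top 20.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 40))                                # toy user property features
y = (X[:, 3] + 0.1 * rng.normal(size=1000) > 0).astype(int)    # toy bot/human labels

gain = mutual_info_classif(X, y, random_state=0)
top20 = np.argsort(gain)[::-1][:20]    # indices of the 20 most informative features
print(top20)
```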
As one of the prevalent methods for building automation systems, Imitation Learning (IL) delivers promising performance in a wide range of domains. However, despite the considerable improvements in policy performance, research on the explainability of IL models remains limited. Inspired by recent approaches in explainable artificial intelligence, we propose a model-agnostic explanation framework for IL models called R2RISE. R2RISE aims to explain the overall policy performance with respect to the frames in the demonstrations. It iteratively retrains the black-box IL model on randomly masked demonstrations and uses the conventional evaluation outcome, environment returns, as the coefficient to build an importance map. We also conducted experiments to investigate three major questions: whether frames are equally important, how effective the importance map is, and how importance maps from different IL models relate to each other. The results show that R2RISE successfully distinguishes important frames from the demonstrations.
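A simplified, runnable sketch of the RISE-style importance-map loop follows; the train-and-evaluate step is stubbed out with a synthetic return, standing in for actually retraining the IL model and rolling it out in the environment.

```python
# Accumulate a per-frame importance map from randomly masked demonstrations,
# weighting each mask by the (stubbed) environment return it achieves.
import numpy as np

rng = np.random.default_rng(0)
num_frames, num_samples = 50, 200

def train_and_evaluate(mask):
    # Stub: pretend frames 20-29 are the ones that matter for the policy's return.
    return mask[20:30].mean() + 0.05 * rng.normal()

importance = np.zeros(num_frames)
total_weight = 0.0
for _ in range(num_samples):
    mask = (rng.random(num_frames) < 0.5).astype(float)   # randomly keep about half the frames
    ret = train_and_evaluate(mask)                          # environment return as the coefficient
    importance += ret * mask
    total_weight += ret
importance /= max(total_weight, 1e-8)
print(importance.argsort()[::-1][:10])                      # frames ranked most important
```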
Compressed videos often exhibit visually annoying artifacts, known as Perceivable Encoding Artifacts (PEAs), which dramatically degrade video visual quality. Subjective and objective measures capable of identifying and quantifying the various types of PEAs are critical for improving visual quality. In this paper, we investigate the influence of four spatial PEAs (blurring, blocking, bleeding, and ringing) and two temporal PEAs (flickering and floating) on video quality. For the spatial artifacts, we propose a visual saliency model with a low computational cost and higher consistency with human visual perception. For the temporal artifacts, the self-attention-based TimeSformer is improved to detect them. Based on these six types of PEAs, we propose a quality metric called Saliency-Aware Spatio-Temporal Artifacts Measurement (SSTAM). Experimental results demonstrate that the proposed method outperforms state-of-the-art metrics. We believe SSTAM will be beneficial for optimizing video coding techniques.
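As a rough illustration of saliency-weighted pooling of per-artifact scores, consider the sketch below; the equal weights and the pooling rule are assumptions, not the SSTAM definition.

```python
# Pool per-pixel artifact strength maps into one score, weighted by visual saliency.
import numpy as np

def pooled_quality(artifact_maps, saliency_map, weights):
    saliency = saliency_map / (saliency_map.sum() + 1e-8)
    score = 0.0
    for name, amap in artifact_maps.items():
        score += weights[name] * float((amap * saliency).sum())  # saliency-weighted average
    return score

h, w = 64, 64
rng = np.random.default_rng(0)
maps = {a: rng.random((h, w)) for a in
        ["blurring", "blocking", "bleeding", "ringing", "flickering", "floating"]}
weights = {a: 1.0 / 6 for a in maps}
print(pooled_quality(maps, rng.random((h, w)), weights))
```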
We propose a distributionally robust return-risk model for Markov decision processes (MDPs) under risk and reward ambiguity. The proposed model optimizes the weighted average of mean and percentile performances, and it covers the distributionally robust MDPs and the distributionally robust chance-constrained MDPs (both under reward ambiguity) as special cases. By considering that the unknown reward distribution lies in a Wasserstein ambiguity set, we derive the tractable reformulation for our model. In particular, we show that the return-risk model can also account for risk from an uncertain transition kernel when one only seeks deterministic policies, and that a distributionally robust MDP under the percentile criterion can be reformulated as its nominal counterpart at an adjusted risk level. A scalable first-order algorithm is designed to solve large-scale problems, and we demonstrate the advantages of our proposed model and algorithm through numerical experiments.
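One plausible way to write the weighted mean-percentile objective under reward ambiguity is shown below; the symbols (lambda for the mean/percentile weight, alpha for the percentile level, and a Wasserstein ball around a nominal reward distribution) are introduced here for illustration and may not match the paper's exact formulation.

```latex
% Hedged sketch: a weighted mean/Value-at-Risk objective, robustified over a
% Wasserstein ball of reward distributions; the notation is assumed, not the paper's.
\max_{\pi}\;\inf_{\mathbb{P}\in\mathcal{B}_{\epsilon}(\hat{\mathbb{P}})}
\Big\{\,\lambda\,\mathbb{E}_{\mathbb{P}}\!\big[R^{\pi}\big]
 \;+\;(1-\lambda)\,\operatorname{VaR}^{\mathbb{P}}_{\alpha}\!\big[R^{\pi}\big]\Big\},
\qquad \lambda\in[0,1],\ \alpha\in(0,1),
```

where R^pi denotes the cumulative reward of policy pi and the ball has radius epsilon around the nominal reward distribution.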